56 research outputs found

    Integrative modeling identifies genetic ancestry-associated molecular correlates in human cancer

    Cellular and molecular aberrations contribute to the disparity of human cancer incidence and etiology between ancestry groups. Multiomics profiling in The Cancer Genome Atlas (TCGA) allows for querying of the molecular underpinnings of ancestry-specific discrepancies in human cancer. Here, we provide a protocol for integrative associative analysis of ancestry with molecular correlates, including somatic mutations, DNA methylation, mRNA transcription, miRNA transcription, and pathway activity, using TCGA data. This protocol can be generalized to analyze other cancer cohorts and human diseases. For complete details on the use and execution of this protocol, please refer to Carrot-Zhang et al. (2020)

    Comparison of Breast Cancer Molecular Features and Survival by African and European Ancestry in The Cancer Genome Atlas

    Importance: African Americans have the highest breast cancer mortality rate. Although racial difference in the distribution of intrinsic subtypes of breast cancer is known, it is unclear if there are other inherent genomic differences that contribute to the survival disparities. Objectives: To investigate racial differences in breast cancer molecular features and survival and to estimate the heritability of breast cancer subtypes. Design, Setting, and Participants: Among a convenience cohort of patients with invasive breast cancer, breast tumor and matched normal tissue sample data (as of September 18, 2015) were obtained from The Cancer Genome Atlas. Main Outcomes and Measures: Breast cancer–free interval, tumor molecular features, and genetic variants. Results: Participants were 930 patients with breast cancer, including 154 black patients of African ancestry (mean [SD] age at diagnosis, 55.66 [13.01] years; 98.1% [n = 151] female) and 776 white patients of European ancestry (mean [SD] age at diagnosis, 59.51 [13.11] years; 99.0% [n = 768] female). Compared with white patients, black patients had a worse breast cancer–free interval (hazard ratio, 1.67; 95% CI, 1.02-2.74; P = .043). They had a higher likelihood of basal-like (odds ratio, 3.80; 95% CI, 2.46-5.87; P < .001) and human epidermal growth factor receptor 2 (ERBB2 [formerly HER2])–enriched (odds ratio, 2.22; 95% CI, 1.10-4.47; P = .027) breast cancer subtypes, with the Luminal A subtype as the reference. Blacks had more TP53 mutations and fewer PIK3CA mutations than whites. While most molecular differences were eliminated after adjusting for intrinsic subtype, the study found 16 DNA methylation probes, 4 DNA copy number segments, 1 protein, and 142 genes that were differentially expressed, with the gene-based signature having an excellent capacity for distinguishing breast tumors from black vs white patients (cross-validation C index, 0.878). Using germline genotypes, the heritability of breast cancer subtypes (basal vs nonbasal) was estimated to be 0.436 (P = 1.5 × 10^−14). The estrogen receptor–positive polygenic risk score built from 89 known susceptibility variants was higher in blacks than in whites (difference, 0.24; P = 2.3 × 10^−5), while the estrogen receptor–negative polygenic risk score was much higher in blacks than in whites (difference, 0.48; P = 2.8 × 10^−11). Conclusions and Relevance: On the molecular level, after adjusting for intrinsic subtype frequency differences, this study found a modest number of genomic differences but a significant clinical survival outcome difference between blacks and whites in The Cancer Genome Atlas data set. Moreover, more than 40% of breast cancer subtype frequency differences could be explained by genetic variants. These data could form the basis for the development of molecular targeted therapies to improve clinical outcomes for the specific subtypes of breast cancers that disproportionately affect black women. Findings also indicate that personalized risk assessment and optimal treatment could reduce deaths from aggressive breast cancers for black women
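
    The hazard and odds ratios quoted in this abstract can be sanity-checked from their confidence intervals alone. The sketch below is not the study's method: it assumes each 95% CI is a Wald interval, symmetric on the log scale, so the geometric mean of the bounds should reproduce the point estimate and the interval width implies a standard error and an approximate two-sided P value. The function names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def implied_estimate(lower, upper):
    """Geometric mean of the CI bounds (the point estimate if the CI is a log-scale Wald interval)."""
    return math.sqrt(lower * upper)


def log_se(lower, upper):
    """Standard error of log(ratio) implied by the width of a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def wald_p_value(estimate, lower, upper):
    """Approximate two-sided P value for the null hypothesis ratio = 1."""
    z = math.log(estimate) / log_se(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2))


# (estimate, lower, upper) exactly as quoted in the abstract
reported = {
    "breast cancer-free interval (HR)": (1.67, 1.02, 2.74),
    "basal-like vs Luminal A (OR)": (3.80, 2.46, 5.87),
    "ERBB2-enriched vs Luminal A (OR)": (2.22, 1.10, 4.47),
}

for label, (estimate, lower, upper) in reported.items():
    print(
        f"{label}: reported {estimate}, "
        f"implied {implied_estimate(lower, upper):.2f}, "
        f"approx. P = {wald_p_value(estimate, lower, upper):.3g}"
    )
```

    Under that assumption the implied estimates agree with the reported ones to two decimals, and the approximate P values land close to the published ones; small gaps are expected because the published bounds are rounded.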

    Before and After: Comparison of Legacy and Harmonized TCGA Genomic Data Commons’ Data

    We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA)—mRNA and miRNA expression, single nucleotide variants, DNA methylation and copy number alterations—comprehensive sample, gene, and probe-level studies were performed, towards quantifying the degree of similarity between the ‘legacy’ GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as ‘harmonized’ by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve. Gao et al. performed a systematic analysis of the effects of synchronizing the large-scale, widely used, multi-omic dataset of The Cancer Genome Atlas to the current human reference genome. For each of the five molecular data platforms assessed, they demonstrated a very high concordance between the ‘legacy’ GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as ‘harmonized’ by the Genomic Data Commons
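
    The abstract does not say which similarity measure was used for the gene-level comparison. As one illustration of the idea, the sketch below computes a per-gene Pearson correlation between a legacy and a harmonized expression table over the same samples and flags genes that fall below a cutoff. The gene names, values, and the 0.9 threshold are all hypothetical toy choices, not data or parameters from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def per_gene_concordance(legacy, harmonized):
    """Correlation per gene, restricted to genes present in both tables."""
    shared = sorted(set(legacy) & set(harmonized))
    return {gene: pearson(legacy[gene], harmonized[gene]) for gene in shared}


def discordant_genes(concordance, threshold=0.9):
    """Genes whose correlation is below the threshold (NaN is also flagged)."""
    return [gene for gene, r in concordance.items() if not r >= threshold]


# Hypothetical toy data: log2 expression for three genes in the same four samples
legacy_hg19 = {
    "GENE_A": [5.1, 6.0, 7.2, 8.1],
    "GENE_B": [2.0, 2.1, 1.9, 2.2],
    "GENE_C": [1.0, 3.0, 2.0, 4.0],
}
harmonized_hg38 = {
    "GENE_A": [5.0, 6.1, 7.1, 8.3],
    "GENE_B": [2.1, 2.0, 2.0, 2.3],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}

concordance = per_gene_concordance(legacy_hg19, harmonized_hg38)
flagged = discordant_genes(concordance)
print(concordance)
print("genes to inspect:", flagged)
```

    A rank-based measure such as Spearman correlation would be a reasonable alternative when expression distributions are heavily skewed.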

    DNA defects, epigenetics, and gene expression in cancer-adjacent breast: A study from The Cancer Genome Atlas

    Recurrence rates after breast-conserving therapy may depend on genomic characteristics of cancer-adjacent, benign-appearing tissue. Studies have not evaluated recurrence in association with multiple genomic characteristics of cancer-adjacent breast tissue. To estimate the prevalence of DNA defects and RNA expression subtypes in cancer-adjacent, benign-appearing breast tissue at least 2 cm from the tumor margin, cancer-adjacent, pathologically well-characterized, benign-appearing breast tissue specimens from The Cancer Genome Atlas project were analyzed for DNA sequence, copy-number variation, DNA methylation, messenger RNA (mRNA) sequence, and mRNA/microRNA expression. Additional samples were also analyzed by at least one of these genomic data types and associations between genomic characteristics of normal tissue and overall survival were assessed. Approximately 40% of cancer-adjacent, benign-appearing tissues harbored genomic defects in DNA copy number, sequence, methylation, or in RNA sequence, although these defects did not significantly predict 10-year overall survival. Two mRNA/microRNA expression phenotypes were observed, including an active mRNA subtype that was identified in 40% of samples. Controlling for tumor characteristics and the presence of genomic defects, this active subtype was associated with significantly worse 10-year survival among estrogen receptor (ER)-positive cases. This multi-platform analysis of breast cancer-adjacent samples produced genomic findings consistent with current surgical margin guidelines, and provides evidence that extratumoral RNA expression patterns in cancer-adjacent tissue predict overall survival among patients with ER-positive disease

    An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

    For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types

    Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer

    We report a comprehensive analysis of 412 muscle-invasive bladder cancers characterized by multiple TCGA analytical platforms. Fifty-eight genes were significantly mutated, and the overall mutational load was associated with APOBEC-signature mutagenesis. Clustering by mutation signature identified a high-mutation subset with 75% 5-year survival. mRNA expression clustering refined prior clustering analyses and identified a poor-survival “neuronal” subtype in which the majority of tumors lacked small cell or neuroendocrine histology. Clustering by mRNA, long non-coding RNA (lncRNA), and miRNA expression converged to identify subsets with differential epithelial-mesenchymal transition status, carcinoma in situ scores, histologic features, and survival. Our analyses identified 5 expression subtypes that may stratify response to different treatments. A multiplatform analysis of 412 muscle-invasive bladder cancer patients provides insights into mutational profiles with prognostic value and establishes a framework associating distinct tumor subtypes with clinical options

    Comprehensive Analysis of Genetic Ancestry and Its Molecular Correlates in Cancer

    We evaluated ancestry effects on mutation rates, DNA methylation, and mRNA and miRNA expression among 10,678 patients across 33 cancer types from The Cancer Genome Atlas. We demonstrated that cancer subtypes and ancestry-related technical artifacts are important confounders that have been insufficiently accounted for. Once accounted for, ancestry-associated differences spanned all molecular features and hundreds of genes. Biologically significant differences were usually tissue specific but not specific to cancer. However, admixture and pathway analyses suggested some of these differences are causally related to cancer. Specific findings included increased FBXW7 mutations in patients of African origin, decreased VHL and PBRM1 mutations in renal cancer patients of African origin, and decreased immune activity in bladder cancer patients of East Asian origin

    The Integrated Genomic Landscape of Thymic Epithelial Tumors

    Thymic epithelial tumors (TETs) are one of the rarest adult malignancies. Among TETs, thymoma is the most common, characterized by a unique association with autoimmune diseases, followed by thymic carcinoma, which is less common but more clinically aggressive. Using multi-platform omics analyses on 117 TETs, we identify four subtypes of these tumors, defined by genomic hallmarks and associated with survival and World Health Organization histological subtype. We further demonstrate a marked prevalence of a thymoma-specific mutated oncogene, GTF2I, and explore its biological effects on multi-platform analysis. We further observe enrichment of mutations in HRAS, NRAS, and TP53. Last, we identify a molecular link between thymoma and the autoimmune disease myasthenia gravis, characterized by tumoral overexpression of muscle autoantigens and increased aneuploidy. Radovich et al. perform multi-platform analyses of thymic epithelial tumors. They identify high prevalence of GTF2I mutations and enrichment of mutations in HRAS, NRAS, and TP53, and link overexpression of muscle autoantigens and increased aneuploidy in thymoma to patients’ risk of having myasthenia gravis

    Integrated Molecular Characterization of Testicular Germ Cell Tumors

    We studied 137 primary testicular germ cell tumors (TGCTs) using high-dimensional assays of genomic, epigenomic, transcriptomic, and proteomic features. These tumors exhibited high aneuploidy and a paucity of somatic mutations. Somatic mutation of only three genes achieved significance—KIT, KRAS, and NRAS—exclusively in samples with seminoma components. Integrated analyses identified distinct molecular patterns that characterized the major recognized histologic subtypes of TGCT: seminoma, embryonal carcinoma, yolk sac tumor, and teratoma. Striking differences in global DNA methylation and microRNA expression between histology subtypes highlight a likely role of epigenomic processes in determining histologic fates in TGCTs. We also identified a subset of pure seminomas defined by KIT mutations, increased immune infiltration, globally demethylated DNA, and decreased KRAS copy number. We report potential biomarkers for risk stratification, such as miRNA specifically expressed in teratoma, and others with molecular diagnostic potential, such as CpH (CpA/CpC/CpT) methylation identifying embryonal carcinomas. Shen et al. identify molecular characteristics that classify testicular germ cell tumor types, including a separate subset of seminomas defined by KIT mutations. This provides a set of candidate biomarkers for risk stratification and potential therapeutic targeting

    Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas

    Sarcomas are a broad family of mesenchymal malignancies exhibiting remarkable histologic diversity. We describe the multi-platform molecular landscape of 206 adult soft tissue sarcomas representing 6 major types. Along with novel insights into the biology of individual sarcoma types, we report three overarching findings: (1) unlike most epithelial malignancies, these sarcomas (excepting synovial sarcoma) are characterized predominantly by copy-number changes, with low mutational loads and only a few genes (, , ) highly recurrently mutated across sarcoma types; (2) within sarcoma types, genomic and regulomic diversity of driver pathways defines molecular subtypes associated with patient outcome; and (3) the immune microenvironment, inferred from DNA methylation and mRNA profiles, associates with outcome and may inform clinical trials of immune checkpoint inhibitors. Overall, this large-scale analysis reveals previously unappreciated sarcoma-type-specific changes in copy number, methylation, RNA, and protein, providing insights into refining sarcoma therapy and relationships to other cancer types